TimeBank-Driven TimeML Analysis
نویسندگان
چکیده
The design of TimeML as an expressive language for temporal information brings promises, and challenges; in particular, its representational properties raise the bar for traditional information extraction methods applied to the task of text-to-TimeML analysis. A reference corpus, such as TimeBank, is an invaluable asset in this situation; however, certain characteristics of TimeBank—size and consistency, primarily—present challenges of their own. We discuss the design, implementation, and performance of an automatic TimeML-compliant annotator, trained on TimeBank, and deploying a hybrid analytical strategy of mixing aggressive finitestate processing over linguistic annotations with a state-of-the-art machine learning technique capable of leveraging large amounts of unannotated data. The results we report are encouraging in the light of a close analysis of TimeBank; at the same time they are indicative of the need for more infrastructure work, especially in the direction of creating a larger and more robust reference corpus. 1
منابع مشابه
Analysis of TimeBank as a Resource for TimeML Parsing
We present an analysis of the TimeBank corpus—the only available reference for TimeML-compliant annotation—from the point of view of its utility as a training resource for developing automated TimeML annotators. Experimental results indicative of the potential of TimeBank are encouraging; at the same time, closer inspection of causes for some systematic errors shows certain deficiencies in the ...
متن کاملTRIOS-TimeBank Corpus: Extended TimeBank Corpus with Help of Deep Understanding of Text
TimeBank (Pustejovsky et al, 2003a), a reference for TimeML (Pustejovsky et al, 2003b) compliant annotation, is widely used temporally annotated corpus in the community. It captures time expressions, events, and relations between events and event and temporal expression; but there is room for improvements in this hand-annotated widely used TimeBank corpus. This work is one such effort to extend...
متن کاملIncreasing Informativeness in Temporal Annotation
In this paper, we discuss some of the challenges of adequately applying a specification language to an annotation task, as embodied in a specific guideline. In particular, we discuss some issues with TimeML motivated by error analysis on annotated TLINKs in TimeBank. We introduce a document level information structure we call a narrative container (NC), designed to increase informativeness and ...
متن کاملFrench TimeBank: An ISO-TimeML Annotated Reference Corpus
This article presents the main points in the creation of the French TimeBank (Bittar, 2010), a reference corpus annotated according to the ISO-TimeML standard for temporal annotation. A number of improvements were made to the markup language to deal with linguistic phenomena not yet covered by ISO-TimeML, including cross-language modifications and others specific to French. An automatic preanno...
متن کاملKorean TimeML and Korean TimeBank
Many emerging documents usually contain temporal information. Because the temporal information is useful for various applications, it became important to develop a system of extracting the temporal information from the documents. Before developing the system, it first necessary to define or design the structure of temporal information. In other words, it is necessary to design a language which ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005